Scalable fault-containing self-stabilization in dynamic networks

Author

  • Sven Köhler
Abstract

Self-stabilizing distributed systems provide a high degree of non-masking fault-tolerance. They recover from transient faults of any scale or nature without human intervention. In general, however, the time needed to recover from small-scale transient faults may not differ significantly from the time needed to recover from large-scale transient faults. Bounding the impact of small-scale faults has been pursued with two independent objectives: reducing the time needed to recover from state corruptions, e.g., fault-containment, and optimizing the system’s reaction upon topological changes, e.g., super-stabilization. The objective of fault-containing self-stabilization is to limit the effects of the corruption of a single node’s state to a local area, i.e., to contain the fault, and to ensure that correctness of the output is regained within constant time. Transformations that add the property of fault-containment to any silent self-stabilizing algorithm exist. However, their fault-gap, i.e., the time needed to prepare for the containment of another fault, is linear in the number of nodes. Hence, these transformations do not perform well in large networks. The root cause of this is the use of global synchronization and reset. This thesis presents a novel scheme for local synchronization. Based on it, a new transformation for adding fault-containment to silent self-stabilizing algorithms is developed. Its fault-gap and slowdown are constant. These are major improvements over previous solutions. The effects of a state corruption are strictly limited to the 2-hop neighborhood of the fault. Similar to previous work, the transformation creates backups of a node’s local state to detect and revert state corruptions. However, the number of backups per node is reduced from O(∆) to 2 backups per node. 
To balance the number of backups stored by each node, this thesis presents a self-stabilizing algorithm for computing a placement of backups such that the standard deviation of the number of backups stored per node assumes a local minimum. Super-stabilizing algorithms are not only self-stabilizing but also guarantee that a safety property is satisfied while the system recovers from a topology change. This thesis introduces the notion of fault-containing super-stabilization and presents a transformation that can be used to add fault-containment to any silent super-stabilizing algorithm. Fault-containing super-stabilizing distributed algorithms are fault-containing, super-stabilizing, and guarantee that the safety property is satisfied within constant time even if a corruption of a single node’s state and a topology change occur at the same time. The transformations and algorithms presented in this thesis all work under the most general model used in self-stabilization research: the unfair distributed scheduler. Their correctness is proven using a new technique called serialization, which is introduced in this thesis. It is based on the observation that the parallel execution of a distributed algorithm can be replaced with a sequential execution, provided the algorithm satisfies a particular condition.
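
As a rough illustration of how backups of a node's local state can be used to detect and revert a single state corruption, the sketch below takes a majority vote over a node's own state and two backup copies. This is an assumption made for illustration only, not the transformation developed in the thesis; the function name `repair` and the representation of states as hashable values are hypothetical.

```python
from collections import Counter


def repair(own_state, backup_a, backup_b):
    """Return the value held by at least two of the three copies.

    Illustrative only: assumes at most one of the three copies is corrupted.
    Returns None when all three copies disagree, i.e. when more than one
    copy must have been corrupted and a majority vote cannot decide.
    """
    value, count = Counter([own_state, backup_a, backup_b]).most_common(1)[0]
    return value if count >= 2 else None


if __name__ == "__main__":
    print(repair(5, 5, 5))  # all copies agree, nothing to repair
    print(repair(9, 5, 5))  # the node's own state was corrupted
    print(repair(5, 9, 5))  # one backup was corrupted
    print(repair(1, 2, 3))  # no majority: outside the single-fault assumption
```

With three copies, any single corrupted copy is outvoted by the other two, which is one simple way to see why a small, constant number of backups can be enough for the single-fault case.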

Similar resources

Self-stabilization versus Robust Self-stabilization for Clustering in Ad-Hoc Network

In this paper, we compare two fault-tolerant approaches, self-stabilization and robust self-stabilization, and investigate their performance in dynamic networks. We study the behavior of four clustering protocols: two self-stabilizing protocols, GDMAC and BSC, and their robust self-stabilizing versions, R-GDMAC and R-BSC. The performances of protocols are compared in terms of their cluster-heads num...

CATS: Linearizability and Partition Tolerance in Scalable and Self-Organizing Key-Value Stores

Distributed key-value stores provide scalable, fault-tolerant, and self-organizing storage services, but fall short of guaranteeing linearizable consistency in partially synchronous, lossy, partitionable, and dynamic networks, when data is distributed and replicated automatically by the principle of consistent hashing. This paper introduces consistent quorums as a solution for achieving atomic c...

Superstabilizing, Fault-Containing Distributed Combinatorial Optimization

Self-stabilization in distributed systems is the ability of a system to respond to transient failures by eventually reaching a legal state and maintaining it afterwards. This makes such systems particularly interesting because they can tolerate faults and are able to cope with dynamic environments. We propose the first self-stabilizing mechanism for multiagent combinatorial optimization, whic...

Self-Stabilization Workshop

The DSN Workshop on Self-Stabilization’s programme includes fifteen research presentations. The main areas in the programme are network protocols, sensor networks, distributed algorithms, methods for analysis of self-stabilization, distributed system fault tolerance, and techniques used in the construction of systems that self-

Dynamic configuration and collaborative scheduling in supply chains based on scalable multi-agent architecture

Due to diversified and frequently changing demands from customers, technological advances and global competition, manufacturers rely on collaboration with their business partners to share costs, risks and expertise. How to take advantage of advancement of technologies to effectively support operations and create competitive advantage is critical for manufacturers to survive. To respond to these...

Publication date: 2014